8 research outputs found

    Two-Stage Transfer Learning for Heterogeneous Robot Detection and 3D Joint Position Estimation in a 2D Camera Image using CNN

    Full text link
    Collaborative robots are becoming more common on factory floors as well as in regular environments; however, their safety is still not a fully solved issue. Collision detection does not always perform as expected and collision avoidance is still an active research area. Collision avoidance works well for fixed robot-camera setups, however, if they are shifted around, the Eye-to-Hand calibration becomes invalid, making it difficult to accurately run many of the existing collision avoidance algorithms. We approach the problem by presenting a stand-alone system capable of detecting the robot and estimating its position, including individual joints, using a simple 2D colour image as an input, where no Eye-to-Hand calibration is needed. As an extension of previous work, a two-stage transfer learning approach is used to re-train a multi-objective convolutional neural network (CNN) to allow it to be used with heterogeneous robot arms. Our method is capable of detecting the robot in real-time, and new robot types can be added using significantly smaller training datasets compared to the requirements of a fully trained network. We present the data collection approach, the structure of the multi-objective CNN, the two-stage transfer learning training, and test results using real robots from Universal Robots, Kuka, and Franka Emika. Finally, we analyse possible application areas of our method together with possible improvements. Comment: 6+n pages, ICRA 2019 submission
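    As a rough illustration of the two-stage idea described in this abstract, the sketch below (not the authors' code; the architecture, layer sizes, objectives and optimiser settings are assumptions) first trains only the task-specific heads of a shared-backbone multi-objective CNN on the small dataset collected for a new robot, then unfreezes everything for a short, lower-learning-rate fine-tune:

```python
# Minimal sketch of two-stage transfer learning on a multi-objective CNN (illustrative only).
import torch
import torch.nn as nn

class MultiObjectiveCNN(nn.Module):
    def __init__(self, num_robot_types=3, num_joints=7):
        super().__init__()
        # shared convolutional feature extractor (placeholder architecture)
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(8),
        )
        feat = 64 * 8 * 8
        # per-objective heads: robot type, 3D base position, 3D joint positions
        self.type_head = nn.Linear(feat, num_robot_types)
        self.base_head = nn.Linear(feat, 3)
        self.joint_head = nn.Linear(feat, num_joints * 3)

    def forward(self, x):
        f = self.backbone(x).flatten(1)
        return self.type_head(f), self.base_head(f), self.joint_head(f)

def train_stage(model, loader, epochs, lr, freeze_backbone):
    # stage 1 freezes the shared features; stage 2 fine-tunes everything
    for p in model.backbone.parameters():
        p.requires_grad = not freeze_backbone
    opt = torch.optim.Adam((p for p in model.parameters() if p.requires_grad), lr=lr)
    ce, mse = nn.CrossEntropyLoss(), nn.MSELoss()
    for _ in range(epochs):
        for img, robot_type, base_xyz, joints_xyz in loader:
            t, b, j = model(img)
            loss = ce(t, robot_type) + mse(b, base_xyz) + mse(j, joints_xyz)
            opt.zero_grad()
            loss.backward()
            opt.step()

# stage 1: adapt the heads on the small new-robot dataset; stage 2: fine-tune the whole net
# train_stage(model, new_robot_loader, epochs=10, lr=1e-3, freeze_backbone=True)
# train_stage(model, new_robot_loader, epochs=5,  lr=1e-4, freeze_backbone=False)
```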

    Transfer Learning for Unseen Robot Detection and Joint Estimation on a Multi-Objective Convolutional Neural Network

    Full text link
    A significant problem of using deep learning techniques is the limited amount of data available for training. There are some datasets available for popular problems like item recognition and classification or self-driving cars; however, data is very limited for the industrial robotics field. In previous work, we trained a multi-objective Convolutional Neural Network (CNN) to identify the robot body in the image and estimate 3D positions of the joints using just a 2D image, but it was limited to a range of robots produced by Universal Robots (UR). In this work, we extend our method to work with a new robot arm, the Kuka LBR iiwa, which has a significantly different appearance and an additional joint. However, instead of collecting large datasets once again, we collect a number of smaller datasets containing a few hundred frames each and use transfer learning techniques on the CNN trained on UR robots to adapt it to the new robot, which has a different shape and visual features. We have shown that transfer learning is not only applicable in this field, but also requires smaller, well-prepared training datasets, trains significantly faster and reaches similar accuracy compared to the original method, even improving it in some aspects. Comment: Regular paper submission to 2018 IEEE International Conference on Intelligence and Safety Robotics (ISR). Camera Ready paper
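    A hedged sketch of how such an adaptation could look in practice: reuse the weights of a network trained on UR robots and replace only the joint-regression head so it matches the extra joint of the Kuka LBR iiwa, then fine-tune on the small new dataset. MultiObjectiveCNN and train_stage refer to the illustrative model sketched earlier, not to the paper's implementation, and the checkpoint path is hypothetical:

```python
# Illustrative head swap for a robot with a different joint count (assumptions noted above).
import torch

ur_model = MultiObjectiveCNN(num_robot_types=3, num_joints=6)
# ur_model.load_state_dict(torch.load("ur_pretrained.pt"))  # hypothetical checkpoint

kuka_model = MultiObjectiveCNN(num_robot_types=3, num_joints=7)

# copy every pretrained tensor that still fits; the mismatched joint head starts from scratch
pretrained = ur_model.state_dict()
target = kuka_model.state_dict()
for name, tensor in pretrained.items():
    if name in target and target[name].shape == tensor.shape:
        target[name] = tensor
kuka_model.load_state_dict(target)

# fine-tune on the few-hundred-frame Kuka dataset, e.g. with train_stage(...) from above
```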

    Multi-Objective Convolutional Neural Networks for Robot Localisation and 3D Position Estimation in 2D Camera Images

    Full text link
    The field of collaborative robotics and human-robot interaction often focuses on the prediction of human behaviour while assuming that the information about the robot setup and configuration is known. This is often the case with fixed setups, which have all the sensors fixed and calibrated in relation to the rest of the system. However, it becomes a limiting factor when the system needs to be reconfigured or moved. We present a deep learning approach which aims to solve this issue. Our method learns to identify and precisely localise the robot in 2D camera images, so a fixed setup is no longer a requirement and the camera can be moved. In addition, our approach identifies the robot type and estimates the 3D position of the robot base in the camera image, as well as the 3D positions of each of the robot joints. Learning is done using a multi-objective convolutional neural network that addresses the four previously mentioned objectives simultaneously through a combined loss function. The multi-objective approach makes the system more flexible and efficient by reusing some of the same features and diversifying for each objective in lower layers. A fully trained system shows promising results in providing an accurate mask of where the robot is located and an estimate of its base and joint positions in 3D. We compare the results to our previous approach of using cascaded convolutional neural networks. Comment: Ubiquitous Robots 2018 Regular paper submission
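    A minimal sketch of the combined-loss idea, with one term per objective (robot mask, robot type, 3D base position, 3D joint positions). The loss choices and weights are illustrative assumptions, not the values used in the paper:

```python
import torch.nn as nn

bce, ce, mse = nn.BCEWithLogitsLoss(), nn.CrossEntropyLoss(), nn.MSELoss()

def combined_loss(pred, target, w_mask=1.0, w_type=1.0, w_base=1.0, w_joints=1.0):
    """pred and target are dicts holding the tensors for the four objectives."""
    return (w_mask   * bce(pred["mask"],   target["mask"]) +
            w_type   * ce(pred["type"],    target["type"]) +
            w_base   * mse(pred["base"],   target["base"]) +
            w_joints * mse(pred["joints"], target["joints"]))
```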

    Robot Localisation and 3D Position Estimation Using a Free-Moving Camera and Cascaded Convolutional Neural Networks

    Full text link
    Many works in collaborative robotics and human-robot interaction focus on identifying and predicting human behaviour while considering the information about the robot itself as given. This can be the case when the sensors and the robot are calibrated in relation to each other, and often the reconfiguration of the system is not possible or requires extra manual work. We present a deep learning based approach to remove the constraint that the robot and the vision sensor must be fixed and calibrated in relation to each other. The system learns the visual cues of the robot body and is able to localise it, as well as estimate the positions of the robot joints in 3D space, using just a 2D color image. The method uses a cascaded convolutional neural network, and we present the structure of the network, describe our own collected dataset, and explain the network training and achieved results. A fully trained system shows promising results in providing an accurate mask of where the robot is located and a good estimate of its joint positions in 3D. The accuracy is not yet good enough for visual servoing applications; however, it can be sufficient for general safety and for some collaborative tasks not requiring very high precision. The main benefit of our method is that the vision sensor can move freely. This allows it to be mounted on moving objects, for example, a person's body or a mobile robot working in the same environment as the robots. Comment: Submission for IEEE AIM 2018 conference, 7 pages, 7 figures, ROBIN group, University of Oslo
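    The cascaded structure can be pictured as one network that segments the robot and a second network that regresses joint positions from the masked image. The sketch below is a generic illustration under that assumption; both networks are placeholders, not the architecture from the paper:

```python
import torch
import torch.nn as nn

class MaskNet(nn.Module):
    """Stage 1: coarse robot segmentation from the colour image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 1, 1))
    def forward(self, x):
        return torch.sigmoid(self.net(x))

class JointNet(nn.Module):
    """Stage 2: regress 3D joint positions from the masked image."""
    def __init__(self, num_joints=6):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(4), nn.Flatten(),
                                 nn.Linear(16 * 4 * 4, num_joints * 3))
    def forward(self, x):
        return self.net(x)

def cascade(image, mask_net, joint_net):
    mask = mask_net(image)              # where is the robot?
    focused = image * (mask > 0.5)      # suppress background before the second stage
    return mask, joint_net(focused)     # flattened (x, y, z) per joint
```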

    Orthorectified mosaicking of images from small-scale unmanned aerial vehicles

    No full text
    Unmanned aerial vehicles (UAVs) have recently been deployed in various civilian applications such as environmental monitoring, aerial imagery, and surveillance. Small-scale UAVs are of special interest for first responders since they can rather easily provide bird's-eye-view images of disaster areas. For such UAVs, the number of images and the positions at which to capture them are predefined due to limitations in flight time, communication bandwidth and local processing. The main goal of this thesis is to develop methods for mosaicking individual aerial images taken from homogeneous or heterogeneous sensors on small-scale UAVs. The mosaicking of images taken in such scenarios is challenging compared to panorama construction or other mosaicking methods such as satellite image mosaicking. When flying with UAVs at a relatively low altitude (below 100 m), non-planar objects on the ground make the feature matching and image registration more difficult. In addition, other artifacts such as dynamic scenes, lens distortion, and heterogeneous sensors make the mosaicking procedure more difficult. In this thesis we focus on producing orthorectified and incremental mosaics from low-altitude aerial images. The orthorectification is important in order to preserve relative distances in the mosaic. Incremental mosaicking means updating the mosaic in real time while individual images are being added. We present two methods to construct such mosaics. The first method combines the metadata of the images, such as GPS positions and the UAV orientations, with image processing techniques to construct the mosaic. The second method does not exploit any metadata and only uses the images. With this method we find and mitigate the sources of error in the process of incremental mosaicking to achieve an orthorectified mosaic. Unlike some other mosaicking approaches, we avoid any global optimization because of its high computational complexity. Furthermore, global optimization methods require all images at once, while in our incremental mosaicking we do not reposition any of the previously mosaicked images. Finally, we demonstrate some novel methods for multispectral aerial imagery with thermal and visual (also referred to as RGB) cameras. We show how to register images of different spectra and how to improve the quality of this interspectral registration. The contributions of this part include (i) the introduction of a feature descriptor for robustly identifying correspondences in images of different spectra, (ii) the registration of image mosaics, and (iii) registration based on depth maps. (Saeed Yahyanejad, Klagenfurt, Alpen-Adria-Universität, Diss., 2013)
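    As a rough illustration of the image-only, incremental registration step described above (the orthorectification, error mitigation and metadata-based method of the thesis are not reproduced here), one could match features between the current mosaic and the next frame, estimate a RANSAC homography and warp the new image into the mosaic frame without repositioning earlier images. The feature type, thresholds and blending are assumptions:

```python
import cv2
import numpy as np

def add_to_mosaic(mosaic, new_image):
    # detect and match binary features between the mosaic and the incoming frame
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(mosaic, None)
    k2, d2 = orb.detectAndCompute(new_image, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(d2, d1), key=lambda m: m.distance)[:200]

    # homography mapping the new image into the mosaic frame, estimated with RANSAC
    src = np.float32([k2[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k1[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    warped = cv2.warpPerspective(new_image, H, (mosaic.shape[1], mosaic.shape[0]))
    # simple overlay; seam handling and blending are omitted for brevity
    return np.where(warped > 0, warped, mosaic)
```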

    Estimation of situation awareness score and performance using eye and head gaze for human-robot collaboration

    No full text
    Human attention processes play a major role in the optimization of human-robot collaboration (HRC) [Huang et al. 2015]. We describe a novel methodology to measure and predict situation awareness from eye and head gaze features in real time. Awareness of scene objects of interest was described by 3D gaze analysis using data from eye tracking glasses and a precise optical tracking system. A probabilistic framework of uncertainty is used to cope with measurement errors in eye and position estimation. Comprehensive HRC experiments were conducted with typical tasks, including handover, in a lab-based prototypical manufacturing environment. The gaze features highly correlate with scores of standardized questionnaires of situation awareness (SART [Taylor 1990], SAGAT [Endsley 2000]) and predict performance in the HRC task. This opens new opportunities for human-factors-based optimization in HRC applications.
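    A hedged illustration of the kind of analysis this abstract refers to: correlating a 3D-gaze feature (here, dwell time on task-relevant objects) with SART questionnaire scores and fitting a simple predictor of task performance. The arrays are placeholder data, not results from the study:

```python
import numpy as np
from scipy import stats

dwell_time = np.array([2.1, 3.4, 1.8, 4.0, 2.9, 3.7])      # s spent on objects of interest
sart_score = np.array([18, 24, 16, 27, 22, 25])             # per-participant SART totals
task_time = np.array([41.0, 33.5, 44.2, 30.1, 35.8, 31.9])  # handover completion time, s

# correlation between the gaze feature and the questionnaire-based awareness score
r, p = stats.pearsonr(dwell_time, sart_score)
print(f"dwell vs SART: r={r:.2f}, p={p:.3f}")

# simple least-squares predictor of performance from the gaze feature
slope, intercept, r_perf, _, _ = stats.linregress(dwell_time, task_time)
print(f"predicted completion time at 3 s dwell: {slope * 3.0 + intercept:.1f} s (r={r_perf:.2f})")
```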

    Two-Stage Transfer Learning for Heterogeneous Robot Detection and 3D Joint Position Estimation in a 2D Camera Image Using CNN

    No full text
    Collaborative robots are becoming more common on factory floors as well as in regular environments; however, their safety is still not a fully solved issue. Collision detection does not always perform as expected and collision avoidance is still an active research area. Collision avoidance works well for fixed robot-camera setups, however, if they are shifted around, the Eye-to-Hand calibration becomes invalid, making it difficult to accurately run many of the existing collision avoidance algorithms. We approach the problem by presenting a stand-alone system capable of detecting the robot and estimating its position, including individual joints, using a simple 2D colour image as an input, where no Eye-to-Hand calibration is needed. As an extension of previous work, a two-stage transfer learning approach is used to re-train a multi-objective convolutional neural network (CNN) to allow it to be used with heterogeneous robot arms. Our method is capable of detecting the robot in real-time, and new robot types can be added using significantly smaller training datasets compared to the requirements of a fully trained network. We present the data collection approach, the structure of the multi-objective CNN, the two-stage transfer learning training, and test results using real robots from Universal Robots, Kuka, and Franka Emika. Finally, we analyse possible application areas of our method together with possible improvements.

    Transfer Learning for Unseen Robot Detection and Joint Estimation on a Multi-Objective Convolutional Neural Network

    No full text
    A significant problem of using deep learning techniques is the limited amount of data available for training. There are some datasets available for popular problems like item recognition and classification or self-driving cars; however, data is very limited for the industrial robotics field. In previous work, we trained a multi-objective Convolutional Neural Network (CNN) to identify the robot body in the image and estimate 3D positions of the joints using just a 2D image, but it was limited to a range of robots produced by Universal Robots (UR). In this work, we extend our method to work with a new robot arm, the Kuka LBR iiwa, which has a significantly different appearance and an additional joint. However, instead of collecting large datasets once again, we collect a number of smaller datasets containing a few hundred frames each and use transfer learning techniques on the CNN trained on UR robots to adapt it to the new robot, which has a different shape and visual features. We have shown that transfer learning is not only applicable in this field, but also requires smaller, well-prepared training datasets, trains significantly faster and reaches similar accuracy compared to the original method, even improving it in some aspects.